ABSTRACT

We present a novel automated methodology to detect and classify periodic variable
stars in a large database of photometric time series. The methods are based on
multivariate Bayesian statistics and use a multi-stage approach. We applied our method
to the ground-based data of the TrES Lyr1 field, which is also observed by the Kepler
satellite, covering ~26 000 stars. We found many eclipsing binaries as well as classical
non-radial pulsators, such as slowly pulsating B stars, γ Doradus, β Cephei and δ
Scuti stars. A few classical radial pulsators were also found.

Key words: Star: variable; Techniques: photometric; Methods: statistical; Methods: 
data analysis 



1 INTRODUCTION 

In recent years there has been rapid progress in astronomical
instrumentation, giving us an enormous amount of new
time-resolved photometric data and resulting in large
databases. These databases contain many light curves of
variable stars, both of known and unknown nature. Well-known
examples are the large databases resulting from the CoRoT
(Fridlund et al. 2006) and Kepler (Gilliland et al. 2010)
space missions, containing respectively ~100 000 and
~150 000 light curves so far. The ESA Gaia mission, expected
to be launched in 2012, will monitor about one billion
stars during five years. Besides the space missions, large-scale
photometric monitoring of stars with ground-based automated
telescopes also delivers large numbers of light curves. The
challenging task of fast and automated detection and
classification of new variable stars is therefore a necessary first
step in order to make them available for further research
and to study their group properties.

Several efforts have already been made to detect and
classify variable stars. In the framework of the CoRoT
mission, a procedure for fast light curve analysis and
derivation of classification parameters was developed by
Debosscher et al. (2007). That algorithm searches for a fixed
number of frequencies and overtones, giving the same set
of parameters for each star. The variable stars were then
classified using a Gaussian classifier (Debosscher et al. 2007,
2009) and a Bayesian network classifier (Sarro et al. 2009).

In this paper we present a new version of this method
to detect and classify periodic variable stars. In contrast
to the previous versions, the new automated methodology
uses only significant frequencies and overtones to classify
the variables, which gives less rise to confusion, especially
when dealing with ground-based data. In order to be able
to deal with a variable number of parameters, we also
introduce a novel multi-stage approach. This new methodology
offers much more flexibility. We applied this method to
the ground-based photometric data of the TrES Lyr1 field,
covering about 26 000 stars. The classification algorithm
considers various classes of non-radial pulsators, such as
β Cep, slowly pulsating B (SPB), δ Sct and γ Dor stars,
as well as classical radial pulsators (Cepheids, RR Lyr) and
eclipsing binaries (see, e.g., Aerts et al. (2010) for a
definition of the classes of these pulsators).



2 A NEW METHODOLOGY 

2.1 Variability detection 

To detect and extract the variables we performed an auto- 
mated frequency analysis on all time series. The algorithm 
first checks for a possible polynomial trend up to order 2 
and subtracts it, as it can have a large detrimental influence 
on the frequency spectrum through aliasing. The order of 
the trend was determined using a classical likelihood-ratio 
test. Although the coefficients of the trend are recomputed 
each time a new oscillation frequency is added to the fit, the 
order of the trend remains fixed. 

After detrending, the algorithm searches for significant
frequencies and overtones in the residuals, using Fourier
analysis. It searches for the frequency with the
highest amplitude in the discrete Fourier transform and
checks whether this frequency is significant, using the false
alarm probability (Horne & Baliunas 1986; Schwarzenberg-Czerny
1998). Note that a detected frequency peak can be
significant but unreliable. Reliability is checked through
pre-specified frequency intervals that are not trustworthy (e.g.
around multiples of 1 c/d for ground-based data).
Unreliable frequencies are prewhitened, but flagged as
"unreliable" and are not used for classification. If the frequency
is the first significant reliable frequency, then the algorithm
checks whether half of this frequency is also significant and
reliable. In this case, the original new frequency is replaced
with half of this frequency to better model the binary light
curves. In a next step, the algorithm searches for significant
overtones, using the likelihood-ratio test, to model possible
non-sinusoidal variations (like those of RR Lyr stars). This
procedure is repeated as long as significant frequencies are
found. These frequencies $\nu_n$ can be used to make a harmonic
best fit to the light curve of the form:

\[
f(t) = \sum_{i=0}^{K} a_i (t - t_0)^i
     + \sum_{n=1}^{N} \sum_{m=1}^{M} \left[ b_{n,m} \sin\bigl(2\pi \nu_n m (t - t_0)\bigr)
     + c_{n,m} \cos\bigl(2\pi \nu_n m (t - t_0)\bigr) \right], \tag{1}
\]

with $0 \le K \le 2$ the order of the trend, $N$ the number
of significant frequencies, determined using the false alarm
probability, and $M \ge 1$ the number of harmonics,
determined using the likelihood-ratio test.
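
The harmonic model of Eq. (1) can be evaluated with a few lines of Python. This is only a minimal sketch, not the pipeline used in this work: the function name `harmonic_model`, the coefficient layout, and the presence of a sine term next to the cosine term are assumptions made for the illustration.

```python
import math


def harmonic_model(t, t0, trend, freqs, b, c):
    """Evaluate a polynomial trend plus a sum of harmonics at time t.

    trend : [a_0, ..., a_K], polynomial trend coefficients
    freqs : [nu_1, ..., nu_N], frequencies in cycles per day
    b, c  : b[n][m-1] and c[n][m-1] are the sine and cosine amplitudes of
            harmonic m of frequency n (layout assumed for this sketch)
    """
    dt = t - t0
    value = sum(a_i * dt ** i for i, a_i in enumerate(trend))
    for n, nu in enumerate(freqs):
        for m in range(1, len(c[n]) + 1):
            phase = 2.0 * math.pi * nu * m * dt
            value += b[n][m - 1] * math.sin(phase)
            value += c[n][m - 1] * math.cos(phase)
    return value
```

Each frequency may carry its own number of harmonics here, because the inner loop runs over however many coefficients are supplied for that frequency.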

The frequency analysis method used by
Debosscher et al. (2007) performs well on properly
reduced satellite data, for which it was designed, but not on
noisier ground-based data, as many insignificant frequencies
and overtones can degrade the performance of the classifier.

2.2 The classifier 

The aim of supervised classification is to assign to each
variable target a probability that it belongs to a particular
predefined variability class, given a set of observed parameters.
This set of parameters (also called attributes) is obtained
from the variability detection pipeline described above and
contains frequencies, amplitudes and phase differences. The
classifier relies on a set of known examples of each class,
the so-called training set, which needs to represent the
entire variability class well.

We used a novel multi-stage approach, where the
classification problem is divided into several sequential steps.
This classifier partitions the set of given variability classes
$C_i$ into two or more parts: $C^{(1)}, C^{(2)}, \ldots$ This simplifies the
classification by reducing the level of detail to a smaller
number of categories. Each of these partitions $C^{(i)}$, which
can contain several variability classes, is then split again
into $C^{(i,1)}, C^{(i,2)}, \ldots$, which in turn can be partitioned into
$C^{(i,j,1)}, C^{(i,j,2)}$, and so on, each time specializing the
classification until each subpartition contains only one variability
class. These partitions can be represented in a tree.

This approach offers several advantages compared to a
single-stage classifier. The main advantage
is that in each stage a different classifier and a different set of
attributes can be used. This is important because attributes
carrying useful information for the separation of two classes can
be useless or even harmful for distinguishing other classes. In
each stage, attributes that carry no information for the separation
of the classes of interest can be removed, thereby significantly
reducing possible confusion. In addition, it is also possible
to have a variable number of attributes. This allows us to create
different branches for mono- versus multi-periodic pulsators.
In this way we do not need a fixed set of attributes, thereby
avoiding the introduction of spurious frequencies or
overtones, which was sometimes the case in Debosscher et al.
(2007). As already mentioned, this too is important
because insignificant attributes can degrade the performance of
the classifier.

We took each of the classifier nodes in the multi-stage
tree as a Gaussian mixture classifier. The Gaussian mixture
classifier is based on the general law of Bayes:

\[
P(C = c_i \mid A = a) =
  \frac{L(A = a \mid C = c_i)\, P(C = c_i)}
       {\sum_{j=1}^{N_c} L(A = a \mid C = c_j)\, P(C = c_j)}, \tag{2}
\]

with $N_c$ the number of different classes. These classes can
correspond to the variability classes (e.g. β Cep, SPB, ...),
but as the Gaussian mixture classifier is used at the nodes
in the multi-stage classifier, a class in this context may also
correspond to a group of variability classes relevant for a
particular node. $P(C = c_i \mid A = a)$ is the a posteriori probability
of the target belonging to class $c_i$ given the observational
evidence $a$, and is the goal of the classification problem.
$L(A = a \mid C = c_i)$ is the conditional likelihood of an attribute
set $a$ given that it belongs to variability class $c_i$. $P(C = c_i)$
is the a priori probability of a target belonging to class $c_i$.
As no reliable prior values for variability classes are known
yet, we used a uniform prior.
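
As a small illustration of Eq. (2), the sketch below turns per-class likelihoods into posterior probabilities. The function name `posterior` and the dictionary-based interface are assumptions for this example only; with no priors supplied it falls back to the uniform prior adopted in the text.

```python
def posterior(likelihoods, priors=None):
    """Return P(class | attributes) from per-class likelihoods (Bayes' rule).

    likelihoods : dict mapping class name -> L(A = a | C = class)
    priors      : optional dict mapping class name -> P(C = class)
    """
    classes = list(likelihoods)
    if priors is None:
        # Uniform prior: every class is equally probable a priori.
        priors = {c: 1.0 / len(classes) for c in classes}
    evidence = sum(likelihoods[c] * priors[c] for c in classes)
    if evidence == 0.0:
        raise ValueError("all class likelihoods are zero")
    return {c: likelihoods[c] * priors[c] / evidence for c in classes}
```

With a uniform prior the posterior is simply the likelihood of each class divided by the sum of all likelihoods, which is why a target far from every class still receives probabilities that add up to one.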

In previous versions of the classifier, the likelihood was
approximated as a single Gaussian. Some of the variability
classes, however, are not well modeled by a single Gaussian.
An example of this is shown in Fig. 1, in which multiple
components are clearly preferable.

Figure 1. Gaussian mixture in the 2-parameter space ($\log \nu_1$, $\log a$) for the Classical Cepheids in the training set estimated by the
Expectation-Maximization algorithm (EM).

The likelihood is now approximated as a finite sum of
multivariate Gaussians:

\[
L(A = a \mid C = c_i) = \sum_{k=1}^{M_i} \alpha_k\, \phi_k(a \mid \mu_k, \Sigma_k), \tag{3}
\]

where

\[
\phi_k(a \mid \mu_k, \Sigma_k) =
  \frac{1}{(2\pi)^{N_a/2}\, |\Sigma_k|^{1/2}}
  \exp\left( -\frac{1}{2} (a - \mu_k)'\, \Sigma_k^{-1} (a - \mu_k) \right), \tag{4}
\]

with $M_i$ the finite number of Gaussian components of class
$c_i$, $N_a$ the number of attributes, $\alpha_k$ the a priori probability
to belong to component $k$, and $\mu_k$ and $\Sigma_k$ the mean vector
and covariance matrix of Gaussian component $k$.
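
To make the mixture likelihood concrete, the following sketch evaluates a finite sum of bivariate Gaussians, matching the 2-parameter case of Fig. 1. The restriction to two attributes (so that the covariance matrix can be inverted explicitly) and the names `gaussian_2d` and `mixture_likelihood` are assumptions of this example, not part of the method itself.

```python
import math


def gaussian_2d(a, mean, cov):
    """Bivariate normal density; cov = ((s11, s12), (s12, s22))."""
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = a[0] - mean[0], a[1] - mean[1]
    # Quadratic form (a - mu)' Sigma^{-1} (a - mu), using the explicit
    # inverse of a symmetric 2x2 matrix.
    q = (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def mixture_likelihood(a, weights, means, covs):
    """Weighted sum of component densities; weights should sum to one."""
    return sum(w * gaussian_2d(a, mu, cov)
               for w, mu, cov in zip(weights, means, covs))
```

For more than two attributes the same structure applies, but the determinant and the inverse would have to come from a linear-algebra routine.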

For each node of the multi-stage tree, the set of
variability classes is partitioned. The best attributes are selected
for that node and the classifier is trained, meaning that the
Gaussian mixture for each class is determined. To do so, we
used the Expectation-Maximization (EM) method (see e.g.
Gamerman & Migon 1993). Given a variability class, the
unknowns are the number of Gaussian components of each
class, the prior probability to belong to a particular
component, and the mean vectors $\mu_k$ and covariance matrices $\Sigma_k$ of
each component. The EM algorithm is an iterative method
for calculating maximum likelihood estimates of parameters
in probabilistic models, where the model depends on
unobserved latent variables. EM alternates between performing
an expectation (E) step, which computes the expectation
of the log-likelihood evaluated by using the current
estimate for the latent variables, and a maximization (M) step,
which computes parameters maximizing the expected
log-likelihood found in the E step. These parameter estimates
are then used to determine the distribution of the latent
variables in the next E step. Given the number of Gaussian
components $M_i$, the remaining unknowns in the model can
be determined by using this procedure. The actual number
of components is determined using the Bayesian information
criterion (BIC), which is a criterion for model selection
among a set of parametric models with different numbers of
parameters. We obtained three components in the example
in Fig. 1 using BIC. The Akaike information criterion (AIC)
gives the same number of components. This solution turns
out to be very stable when changing initial values, in the
sense that the EM algorithm always converges to the same
solution.
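
The interplay between EM and BIC can be sketched in a one-dimensional setting. The code below is a simplified illustration under stated assumptions (a single attribute, a deterministic quantile-based start, a variance floor, and the hypothetical names `fit_em`, `bic` and `select_components`); it is not the multivariate implementation used for the classifier.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_em(data, k, n_iter=100, var_floor=1e-6):
    """Fit a k-component 1-D Gaussian mixture; return (params, log-likelihood)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
    loglik = 0.0
    for x in xs:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        loglik += math.log(density or 1e-300)
    return (weights, means, variances), loglik


def bic(loglik, k, n):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return n_params * math.log(n) - 2.0 * loglik


def select_components(data, k_max=4):
    """Return the number of components with the lowest BIC, and all scores."""
    scores = {k: bic(fit_em(data, k)[1], k, len(data))
              for k in range(1, k_max + 1)}
    return min(scores, key=scores.get), scores
```

The penalty term grows with the number of free parameters, so an extra component is only accepted when it raises the log-likelihood by more than the added complexity costs.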



2.3 Automated classification 

Once the classifiers in each node are trained, the targets
can be classified. In each node we assign a probability to
each target that it belongs to a particular class relevant for
that node. In order to obtain the final probability for each
variability class we multiply the probabilities along the
corresponding root-to-leaf path using the chain rule of
conditional probability. Let $C^{(k,j,\ldots,n,m)}$ be the subpartition that
contains only class $C_i$. The probability that the target $T$
belongs to $C_i$ is thus given by:

\[
P(T \in C_i \mid \{A_i\}) =
  P\bigl(C^{(k,j,\ldots,n,m)} \mid C^{(k,j,\ldots,n)}\bigr) \cdots
  P\bigl(C^{(k,j)} \mid C^{(k)}\bigr)\, P\bigl(C^{(k)}\bigr), \tag{5}
\]

where we dropped "$T \in$" and the observed attributes $\{A_i\}$
in the right-hand side of the equation, for the sake of
notational simplicity. We retain the most probable class
assignment for a given variable star of unknown type and label it
according to the Mahalanobis distance.
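
The multiplication along a root-to-leaf path can be illustrated as follows. This is a minimal sketch: the function names and the representation of the tree as a dictionary of per-class probability lists are assumptions made for the example.

```python
def leaf_probability(path_probabilities):
    """Multiply the conditional node probabilities along one root-to-leaf path."""
    p = 1.0
    for node_probability in path_probabilities:
        p *= node_probability
    return p


def classify(paths):
    """paths maps each class to the list of node probabilities on its path.

    Returns the most probable class and the final probability of every class.
    """
    totals = {cls: leaf_probability(probs) for cls, probs in paths.items()}
    return max(totals, key=totals.get), totals
```

For instance, a target with node probabilities 0.9 (pulsator), 0.8 (RR Lyr group) and 0.7 (subtype ab) receives a final RRAB probability of 0.9 × 0.8 × 0.7 = 0.504.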

Note that the denominator in Eq. (2) forces the target
to belong to one of the predefined classes, although the
target can be very far from the class centers in attribute space.
For that reason it is important to include an outlier detection
step to flag possible wrong predictions. Debosscher et al.
(2009) approximated a training class with a single
Gaussian, and computed the Mahalanobis distance of a target
to the center of the class as an outlier indicator. For the
multi-stage approach with multi-dimensional Gaussians, we
use the following extension of the Mahalanobis distance:

\[
d = (a - \bar{\mu})'\, \Sigma^{-1} (a - \bar{\mu}), \tag{6}
\]

with $a$ the attribute vector of the target, and $\bar{\mu}$ the center
of mass of the Gaussian mixture. The total variance $\Sigma$ is
defined as the sum of the intra-component variances and
the inter-component variance:




where $\mu_k$ is the mean vector of each of the $M_i$ Gaussian
components. If, and only if, the distance is above a certain
threshold, the outlier flag will be set to indicate that the
target does not seem to belong to any of the predefined classes.
This distance is a multi-dimensional generalisation of the
one-dimensional statistical distance (e.g. distance to a mean
value of a Gaussian in terms of the standard deviation). For
this reason, a value of the distance threshold d = 3 is chosen.
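
The outlier test can be sketched for a two-attribute mixture as follows. The expression for the total covariance is not legible in the text above, so this example assumes the standard law of total covariance for a mixture, with both the intra- and inter-component terms weighted by the component priors; that weighting, the restriction to two dimensions, and all function names are assumptions of this sketch. The distance is returned in the quadratic form of Eq. (6), without taking a square root.

```python
def mixture_moments(weights, means, covs):
    """Center of mass and total covariance of a 2-D Gaussian mixture."""
    mx = sum(w * m[0] for w, m in zip(weights, means))
    my = sum(w * m[1] for w, m in zip(weights, means))
    s11 = s12 = s22 = 0.0
    for w, m, cov in zip(weights, means, covs):
        dx, dy = m[0] - mx, m[1] - my
        # Intra-component covariance plus inter-component scatter
        # (assumed weighting, see the lead-in).
        s11 += w * (cov[0][0] + dx * dx)
        s12 += w * (cov[0][1] + dx * dy)
        s22 += w * (cov[1][1] + dy * dy)
    return (mx, my), ((s11, s12), (s12, s22))


def generalized_distance(a, weights, means, covs):
    """Quadratic form (a - mean)' Sigma^{-1} (a - mean) for the whole mixture."""
    (mx, my), ((s11, s12), (_, s22)) = mixture_moments(weights, means, covs)
    det = s11 * s22 - s12 * s12
    dx, dy = a[0] - mx, a[1] - my
    return (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det


def is_outlier(a, weights, means, covs, threshold=3.0):
    return generalized_distance(a, weights, means, covs) > threshold
```

Because the inter-component scatter widens the total covariance along the direction in which the components are spread, a target lying between two components is not flagged merely for being far from each individual component mean.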

2.4 Training the classifier 

In order to train the classifier, we computed the attributes of
the training set objects, which were taken from Hipparcos,
OGLE and CoRoT, with the variability detection pipeline
described in Section 2.1. We only computed up to 2 significant
frequencies, each with up to 3 harmonics, which in our
experience is sufficient for classification purposes. Since the
quality of the classification results depends crucially on the
quality of the training set, we checked all the light curves
and phase plots in this set. The variability classes we took
into account are listed in Table 1. We carefully set up the
multi-stage tree, which is given in Fig. 2. Applying clustering
techniques on CoRoT data, Sarro et al. (2009) managed
to identify new classes. In view of the Kepler mission, two of
these classes, stars with activity and variables due to
rotational modulation, are taken into account in the multi-stage
tree. A detailed description of these two classes can be found
in Debosscher et al. (2010).

Table 1. The variability classes taken into account in
the multi-stage tree, with the number of light curves
(NLC) used to define the classes.

Class                               NLC
Eclipsing binaries (ECL)            790
Ellipsoidal (ELL)                    35
Classical cepheids (CLCEP)          170
Double-mode cepheids (DMCEP)         79
RR-Lyr stars, subtype ab (RRAB)      70
RR-Lyr stars, subtype c (RRC)        21
RR-Lyr stars, subtype d (RRD)        52
β Cep stars (BCEP)                   28
δ Sct stars (DSCUT)                  86
Slowly pulsating B stars (SPB)       91
γ Dor stars (GDOR)                   33
Mira variables (MIRA)               136
Semi-regular (SR)                   103
Activity (ACT)                       51
Rotational Modulation (ROT)          26

In each node, we manually selected the best attributes
to distinguish the classes considered in that node. In
order to evaluate the significance of an attribute we measured
the information gain and gain ratio with respect to each
class (Witten & Frank 2005). Based on these results we
selected the best attributes in terms of highest information
gain and gain ratio that also make sense from an
astrophysical point of view. In practice 'random' attributes can show
structure, even if they are not supposed to. Attributes that
theory tells us should be random variables were
excluded in order to avoid overfitting. In each node the
classifier was then tested using stratified 10-fold cross-validation
(see e.g. Witten & Frank 2005). In stratified n-fold
cross-validation, the original sample is randomly partitioned into
n subsamples, each containing roughly the same proportions of
the class labels. Of the n subsamples, a single one is retained
as the validation data for testing the model, and the
remaining n - 1 subsamples are used as training data. The
cross-validation process is then repeated n times (the folds), with
each of the n subsamples used exactly once as the
validation data. The n results from the folds are then combined to
produce a single estimate. We kept the attributes
giving the best classification results, not only in terms of
correctly classified targets, but also in terms of accuracy
measured by the area under the ROC curve (Witten & Frank
2005). The higher the area under the ROC curve, the better
the test.
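
The stratified partitioning described above can be sketched in plain Python. This is an illustration only; the function names, the round-robin dealing of class members over the folds and the fixed random seed are choices made for the example.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Return n_folds lists of indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the members of each class round-robin over the folds.
        for position, index in enumerate(members):
            folds[(offset + position) % n_folds].append(index)
        offset += len(members)
    return folds


def train_test_splits(labels, n_folds=10, seed=0):
    """Yield (train_indices, test_indices), each fold serving once as test set."""
    folds = stratified_folds(labels, n_folds, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Small classes matter most here: with only 21 RRC light curves in the training set, an unstratified split could easily leave a fold without any RRC example.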

Stratified 10-fold cross-validation was also applied on
the multi-stage tree as a whole. When only the first
frequency and its main amplitude are available, poor results
are obtained, because there is simply too little information
available for classification. When we leave out those
examples and only use the training examples for which we
have more information, very good results are obtained, as
can be seen in Table 2. Only 5.8% of the training examples
are wrongly classified. When we replace the variability
class models by single Gaussians, we have a worse result
with 7.3% of wrong predictions (see Table 3). When we then
also use only one stage, 10.7% of the training examples are
misclassified (see Table 4). We can thus conclude that our
multi-stage classification tree, with Gaussian mixtures at its
nodes, is a significant improvement.

Figure 2. Multi-stage decomposition. The subtree represented in the S box is not replicated for simplicity.

Table 2. The confusion matrix for the multi-stage tree applied on the training set objects with at least 2 harmonics for the
first frequency. Each stellar variability class in each node is modelled by a finite sum of multivariate Gaussians. The last
line lists the correct classification (CC) for every class separately. The average correct classification is 94.2%.

3 APPLICATION TO TRES DATA

3.1 The TrES Lyr1 dataset

We analyzed 25 947 light curves in the TrES Lyr1 field.
TrES, the Trans-atlantic Exoplanet Survey, is a network
of three ten-centimeter optical telescopes searching the sky
for transiting planets (Alonso et al. 2007; O'Donovan 2008).
This network consisted of Sleuth (Palomar Observatory,
Southern California), the PSST (Lowell Observatory,
Northern Arizona) and STARE (Observatorio del Teide, Canary
Islands, Spain); TrES now excludes Sleuth and STARE,
but includes WATTS. The TrES Lyr1 field is a 5.7° x 5.7°
field, centered on the star 16 Lyr, and is part of the
Kepler field (Alonso et al. 2007). Most light curves have about
15 000 observations spread over a total time span of
approximately 75 days. A small fraction have fewer than 5 000
observations with a total time span of around 62 days. Observations
are given in either the Sloan r (Sleuth) or the Kron-Cousins
R magnitude (PSST) and the mean R magnitude ranges
from 9.2 to 16.3.

Table 3. The confusion matrix for the multi-stage tree applied on the training set objects with at least 2 harmonics for the
first frequency. Each stellar variability class in each node is modelled by a single Gaussian. The last line lists the correct
classification (CC) for every class separately. The average correct classification is 92.7%.

Table 4. The confusion matrix for a single-stage classifier applied on the training set objects with at least 2 harmonics for
the first frequency. Each stellar variability class is modelled by a single Gaussian. The last line lists the correct classification
(CC) for every class separately. The average correct classification is 89.3%.

3.2 Classification of variable stars 

3.2.1 Results of the variability detection 

Using the variability detection algorithm described
in Section 2.1, we searched for frequencies in the
range $3/T_{\rm tot}$ to 50 c/d, with $T_{\rm tot}$ the total time span of
the observations in days. In order to avoid the problem of
daily aliasing in an automated way, small frequency intervals
around multiples of 1 c/d were flagged as "unreliable". Using
a false alarm probability of $\alpha = 0.005$ (the null hypothesis of
only having noise in the light curves is rejected when $P < \alpha$,
with $P$ the probability of finding such a peak in the power
spectrum of a time series that only contains noise), about
18 000 objects were found to be non-constant. The stars for which
we could not find significant frequencies were used to
determine the RMS level of the time series as a function of the
mean magnitude, which is plotted in Fig. 3, indicating to
what level we can detect variability. The upward trend can
be explained in terms of photon noise.
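
The search range, the significance cut and the flagging of daily aliases can be written down compactly. In the sketch below the half-width of the unreliable intervals (`half_width=0.02` c/d) is a purely hypothetical value, since the text does not state the width that was used; the function names are likewise assumptions.

```python
def search_range(total_timespan_days, f_max=50.0):
    """Frequency search interval [3 / T_tot, f_max] in cycles per day."""
    return 3.0 / total_timespan_days, f_max


def is_significant(false_alarm_probability, alpha=0.005):
    """Reject the noise-only hypothesis when P < alpha."""
    return false_alarm_probability < alpha


def is_unreliable(freq, half_width=0.02, max_multiple=50):
    """True if freq (c/d) lies within half_width of a non-zero multiple of 1 c/d.

    half_width is a hypothetical value chosen for illustration.
    """
    nearest = round(freq)
    return 1 <= nearest <= max_multiple and abs(freq - nearest) <= half_width
```

For a light curve spanning 75 days the lower limit of the search range is 3/75 = 0.04 c/d, so periods longer than about 25 days are not searched for.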

3.2.2 Classification results 

We used the multi-stage tree presented in Section 2.4, where
we excluded the stars with activity and variables with
rotational modulation. As already mentioned, these
classes were included in the multi-stage tree in view of the
Kepler mission. However, we do not expect to find good
candidates in the ground-based data of TrES Lyr1, as these
classes are characterized by low amplitudes. The
classification algorithm was able to detect many good candidate class
members. By candidate we mean a target whose highest class
probability exceeds a cutoff value $p_{\min}$ (we used two values,
0.5 and 0.75) and whose generalized Mahalanobis
distance to that class is d < 3. A quick visual check of
the light curves and phase plots of the targets with a
distance above 3 showed that a large fraction of these light curves
suffer from instrumental effects. The results of the
classification are listed in Table 5.
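
The candidate criterion combines a probability cut with a distance cut, which can be expressed as a short function. This is a sketch with assumed names and a dictionary-based interface, not the code used to produce Table 5.

```python
def select_candidate(class_probabilities, distances, p_min=0.5, d_max=3.0):
    """Return the most probable class if it passes both cuts, else None.

    class_probabilities : dict mapping class name -> final class probability
    distances           : dict mapping class name -> generalized Mahalanobis
                          distance of the target to that class
    """
    best = max(class_probabilities, key=class_probabilities.get)
    if class_probabilities[best] > p_min and distances[best] < d_max:
        return best
    return None
```

Calling the function once with `p_min=0.5` and once with `p_min=0.75` gives the two candidate lists corresponding to the two cutoff values.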

As with CoRoT, the main objective for TrES was the 
search for planets. We do not find many Long Period Vari- 
ables (LPV), Cepheids and RR Lyr among its targets. The 
total time span of the light curves is also too short to be 
able to detect Mira type variables. 



3.2.3 Eclipsing binaries and ellipsoidal variables 

Irrespective of the observed field on the sky, we should
always find a number of eclipsing binaries and ellipsoidal
variables. Light curves of eclipsing binaries are very different
from those of pulsating stars and are therefore generally well
separated using the phase differences between the first 3
harmonics of the first frequency. Most detected candidate
binaries therefore have a very high probability (> 90%) of
belonging to the ECL class. We found 158 reliable
eclipsing binaries. Some good examples of eclipsing binary
light curves are shown in Fig. 4. It is remarkable that,
although eclipses are not always easily seen in the light curve,
they clearly show up in the phase plot and are detected by
the classification algorithm.

Figure 3. The RMS of the time series plotted as a function of the mean magnitude for stars having no significant frequencies and no
trend.

Table 5. Overview of the classification results using 2 different cutoff
values for the highest class probability p. A generalized Mahalanobis
distance d < 3 to the most probable class is taken, as defined in Eq. (6).

Class(es)                             p > 0.5   p > 0.75
Eclipsing binaries (ECL)                  158        130
Ellipsoidal (ELL)                         571        214
Classical cepheids (CLCEP)                  3          2
Double-mode cepheids (DMCEP)
RR-Lyr stars, subtype ab (RRAB)             2          2
RR-Lyr stars, subtype c (RRC)               1          4
RR-Lyr stars, subtype d (RRD)
β Cep or δ Sct stars (BCEP/DSCUT)         842        780
SPB or γ Dor stars (SPB/GDOR)             914        496
Mira variables (MIRA)
Semi-regular (SR)                           8          5



3.2.4 Monoperiodic pulsators 



Despite the fact that Cepheids and RR Lyr are easy to
distinguish from other classes due to their large amplitudes,
almost no good candidates were found. Examples of the few
candidates found are shown in Fig. 5.

Figure 4. Panels on the left-hand side show a sample of TrES Lyr1 time series of eclipsing binaries. On the right: the corresponding
phase plots, made with the detected frequency.



3.2.5 Multiperiodic pulsators 

As no colour information was available, confusion between
β Cep and δ Sct stars occurs, because of overlapping
frequency ranges. For this reason we merged these 2 classes
into a single class. It is possible that, for the same target,
these classes have similar probabilities below 0.5, but add up
to a value well above 0.5. Similarly, we could often not make
a clear distinction between γ Dor and SPB stars, because
they show similar gravity-mode spectra. This problem may
be solved by adding supplementary information like
temperature, colours or a spectrum, not only for the targets but
also for the training sets. Although frequencies around
multiples of 1 c/d were flagged as unreliable, the γ Dor
and SPB classes in particular suffer from the combination of daily
aliasing and instrumental effects. For these classes, a visual inspection
of the light curves and phase plots was needed. Fig. 6 shows
some good examples of non-radial pulsators.
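
Merging classes that cannot be separated amounts to summing their probabilities before the cutoff is applied. The sketch below illustrates this; the function name and the dictionary layout are assumptions made for the example.

```python
def merge_classes(class_probabilities, groups):
    """Sum the probabilities of classes that cannot be told apart.

    groups maps the name of each merged class to the list of its members.
    """
    merged = dict(class_probabilities)
    for new_name, members in groups.items():
        merged[new_name] = sum(merged.pop(m, 0.0) for m in members)
    return merged
```

A target with probabilities of 0.40 for BCEP and 0.45 for DSCUT fails a cutoff of 0.5 in either class, but passes it comfortably with 0.85 in the merged BCEP/DSCUT class.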

3.3 Discussion and conclusions 

In contrast to previous classification methods for time
series of photometric data (e.g. Debosscher et al. 2007), we
now only use significant frequencies and overtones as
attributes, giving less rise to confusion. We are able to
deal statistically with a variable number of attributes using the
multi-stage approach developed here. Another advantage of
this approach is that the conditional probabilities in each
node can be simplified by dropping one or more attributes
that are not relevant for a particular node. Moreover, in
each node, a different classifier can be chosen. In this paper
we only used the Gaussian mixture classifier, but other
methods such as Bayesian networks can also be used, which gives
more flexibility. Finally, the variability classes were better
described by a finite sum of multivariate Gaussians.

We applied our methods to the ground-based data of
the TrES Lyr1 field, which is also observed by the Kepler
satellite. We found non-radial pulsators such as β Cep stars,
δ Sct stars, SPB stars, and γ Dor stars. Because of the lack of
precise and dereddened colour information, and because of overlap
in frequency range, we could, however, sometimes not avoid
confusion between β Cep and δ Sct stars on the one hand, and
between SPB and γ Dor stars on the other hand. Besides
non-radial pulsators we also mention the detection of binary
stars and some classical radial pulsators. The results of this
classification will be made available through electronic
tables.

Figure 5. Panels on the left-hand side show a sample of TrES Lyr1 time series of radial pulsators. On the right: the corresponding phase
plots, made with the detected frequency.

Figure 6. On the left: some TrES Lyr1 light curves of non-radial pulsators. On the right: the corresponding phase plots, made with the
detected frequency.